
use power iteration for opnorm2 #49487

Closed

Conversation

oscardssmith
Member

This is much faster and allocates O(m) instead of O(m*n)

A  = rand(1024,1024);
# before
@time opnorm(A)
  0.148451 seconds (10 allocations: 8.210 MiB)
512.4395710760909

# after
@time opnorm(A)
  0.001369 seconds (3 allocations: 16.266 KiB)
512.439571076091

@oscardssmith oscardssmith added performance Must go faster linear algebra Linear algebra labels Apr 24, 2023
@StefanKarpinski
Member

Who would be good to review this? Maybe @andreasnoack, or he could recommend someone?

@antoine-levitt
Contributor

Clever! Of course it's better to use Lanczos for this, but maybe as a quick attempt it's acceptable. The performance with this is very unpredictable, though...

You should use randn rather than rand: randn is invariant under rotations and therefore picks points on the sphere uniformly, while rand is very biased: even if you shift it to cover (-1,1), it'll e.g. pick (1,1,...,1) sqrt(n) more often than (1,0,...,0).

I'm concerned about the squaring of A, which can mean that even when the algorithm converges, having a small residual does not mean having a small error. Maybe doing the power method (or at least the residual check) on (0 A';A 0) is better? I'll think about it.
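
For what it's worth, here is a minimal sketch of that idea (not the PR's code; the function name and tolerances are made up): run the same power iteration on A'A, normalizing the intermediate A*v, but measure convergence via the residual of the candidate singular triplet (u, σ, v) viewed as an eigenpair of the Hermitian matrix [0 A'; A 0]. With u = A*v / norm(A*v) and σ = norm(A*v), that residual reduces (up to a factor of sqrt(2)) to norm(A'*u - σ*v), which bounds the distance from σ to the spectrum rather than to its square.

using LinearAlgebra

function opnorm2_power_sketch(A::AbstractMatrix; maxiter=1000,
                              rtol=sqrt(eps(real(float(eltype(A))))))
    T = float(eltype(A))
    v = normalize!(randn(T, size(A, 2)))
    for _ in 1:maxiter
        u = A * v
        σ = norm(u)
        iszero(σ) && return first(svdvals(A))      # v hit the (numerical) null space
        u ./= σ
        w = A' * u                                 # unnormalized next right vector
        norm(w .- σ .* v) ≤ rtol * σ && return σ   # residual check on [0 A'; A 0]
        v = w ./ norm(w)
    end
    return first(svdvals(A))                       # no convergence: dense fallback
end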

@oscardssmith
Member Author

Yeah, I did about 15 minutes of reading theory before choosing this algorithm, and I knew that I wasn't picking the best method, but the others seemed more annoying to implement. As for the performance unpredictability, do I get anything from the fact that I'm only relying on the eigenvalue rather than the eigenvector?

@antoine-levitt
Contributor

do I get anything from the fact that I'm only relying on the eigenvalue rather than the eigenvector?

Not really...

Sorry, I didn't mean unpredictable in the sense that the convergence is unpredictable. The power method is actually one of the most predictable methods in numerical analysis: the convergence rate is the ratio of the second-largest to the largest singular value, squared. I just meant that for some matrices you're going to have a fast runtime, and for some others it's going to be very slow, so that might trip people up. This is especially true because you're running N iterations, so e.g. you could very well have a situation where the iteration doesn't converge for the matrix A but it does for the matrix [A 0; 0 A] (simply because it's run for twice as many iterations), which would result in the former being much slower than the latter.
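
As a back-of-the-envelope illustration of that rate (nothing here is from the PR), the number of iterations needed to reach a tolerance tol scales like log(tol) / (2 log(σ2/σ1)):

iters(ratio, tol) = ceil(Int, log(tol) / (2 * log(ratio)))
iters(0.5, 1e-12)    # → 20: a well-separated top singular value converges fast
iters(0.99, 1e-12)   # → 1375: with a 1% gap the PR's n-iteration budget may not be enough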

All in all I'm not completely convinced about this approach. opnorm is definitely not intended for "production" use (I remember the days when norm was doing opnorm, which was super annoying); it's more for when you want to check something quickly or for teaching purposes, so its speed doesn't really matter (and it's actually rather good that it's slow, because it encourages people not to use it, as it should). If you want this quickly, you usually want some estimate of the opnorm, and it's reasonable to tell people to use a superior algorithm in one of the available libraries, with their own tolerances. I'm also not 100% sure about the accuracy of this method (the 100eps() check seems a bit ad hoc), or about the possibility that it'll get stuck in a local maximum.

Did you actually have a need for this, or was it just for fun?

@oscardssmith
Member Author

For the convergence issues, is there any way to determine if we are in the condition where this will converge badly from a small number of iterations?

This came from slack (https://julialang.slack.com/archives/C6A044SQH/p1682353744395679).

the 100eps() check seems a bit ad hoc

Absolutely, there's definitely something better to do there.

@antoine-levitt
Contributor

For the convergence issues, is there any way to determine if we are in the condition where this will converge badly from a small number of iterations?

If after a few iterations the convergence rate stabilises to something (but the residual remains large), it probably means the algorithm has entered the asymptotic regime, and you can more or less predict how many iterations it's going to take by extrapolating using the convergence rate. But again that's a bit ad hoc.
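
A small sketch of that extrapolation (names and API made up): once the ratio of successive residuals settles near some ρ < 1, the number of further iterations needed to reach tol is roughly log(tol / r) / log(ρ), where r is the current residual.

function remaining_iterations(residuals::AbstractVector{<:Real}, tol::Real)
    length(residuals) < 2 && return nothing
    ρ = residuals[end] / residuals[end - 1]   # observed contraction factor
    0 < ρ < 1 || return nothing               # not (yet) in the asymptotic regime
    return ceil(Int, log(tol / residuals[end]) / log(ρ))
end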

The options here are:

  • Merge this, maybe with a few tweaks (e.g. using the [0 A; A' 0] approach)
  • Implement a Lanczos method (I don't have a lot of experience with this, but for simply estimating the largest singular value there shouldn't be that many stability issues, and a textbook implementation is probably fine)
  • Keep it as it is, maybe adding a note to the docs saying that p=2 computes an SVD and is slow, and that users who want this to be efficient should call a Lanczos algorithm on [0 A; A' 0] (a rough sketch follows below)
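
For concreteness, here is a rough textbook-style sketch of that last option (made-up name; no restarting and no reorthogonalization, so take it as an illustration only): run Lanczos on the Hermitian operator z -> [0 A'; A 0] * z applied matrix-free, and take the largest Ritz value as the estimate of opnorm(A, 2).

using LinearAlgebra

function opnorm2_lanczos_sketch(A::AbstractMatrix; k::Integer=30)
    T = float(eltype(A))
    m, n = size(A)
    # apply [0 A'; A 0] to z, with z split as [first n entries; last m entries]
    applyH(z) = vcat(A' * view(z, n+1:n+m), A * view(z, 1:n))
    q = normalize!(randn(T, m + n))
    qprev = zeros(T, m + n)
    α = real(T)[]                    # diagonal of the Lanczos tridiagonal matrix
    β = real(T)[]                    # off-diagonal
    for j in 1:k
        w = applyH(q)
        push!(α, real(dot(q, w)))
        w .-= α[end] .* q
        j > 1 && (w .-= β[end] .* qprev)
        b = norm(w)
        (j == k || iszero(b)) && break       # finished, or lucky breakdown
        push!(β, b)
        qprev, q = q, w ./ b
    end
    return maximum(eigvals(SymTridiagonal(α, β)))
end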

Honestly I think option 3 is better here, because of the reasons above. I just looked at what other languages are doing: MATLAB appears to use max(svd()) for norm, and has a separate normest that does power iteration (even though they have eigs, which implements a Krylov method! The fact that they haven't bothered upgrading it probably means it's not used very much). NumPy and R do it by SVD.

@ViralBShah
Member

Also cc @stevengj

tmp = zeros(Tnorm, m)
At = A'
v = one(Tnorm)
# this will converge quickly as long as the top two eigenvalues are distinct
Member


Couldn't we use a Lanczos iteration to be more robust and converge quickly even if a few eigenvalues are similar in magnitude?

Member

@stevengj Apr 29, 2023


This paper seems to be a popular algorithm (https://doi.org/10.1137/04060593X), and it's even fancier (restarting, bidiagonalization).

Member Author


Good point. There's a Python implementation of that paper with a good license here: https://github.com/bwlewis/irlbpy/blob/master/irlb/irlb.py. I'll see how it does.

Member Author


Good news and bad news. The good news is that I have it working. The bad news is that it takes about 84 lines of code. Performance testing to come.

@antoine-levitt
Contributor

About rand vs randn (I started replying to @stevengj's deleted comment; I think the experiment below is interesting, so I'm posting it anyway): rand underweights the basis axes, and I thought this would be noticeable, but

using LinearAlgebra, Statistics

function f(n, fun)
    x = fun(n)
    x /= norm(x)
    abs(x[1])
end
println(mean([f(10000, randn) for _ = 1:100]))
println(mean([f(10000, rand) for _ = 1:100]))

yields very similar results, so apparently it's not that bad (looks like a fun exercise in probability theory, at which I'm very bad...). So using rand is probably OK here, but since the perf impact is negligible, might as well take a rotationally invariant distribution...

@stevengj
Member

You basically want to avoid something that is nearly orthogonal to the dominant eigenvector, and the odds of that are about as negligible for rand as for randn, I think; i.e. the probabilities that abs(q' * rand()) ≤ ε and abs(q' * randn()) ≤ ε, for a given small ε and an arbitrary unit vector q, seem like they should be within a modest constant factor of each other even for the worst-case q (the former probability will depend on q because of the lack of rotational symmetry of rand). That's just a guess, though; it would be a nice little probability exercise to work this out precisely.
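
A quick Monte-Carlo sketch of that guess (names and parameters below are purely illustrative): estimate P(abs(q' * x) ≤ ε) for normalized draws x from rand and randn against two particular unit vectors q. For randn the probability cannot depend on q; for rand it can, which is the q-dependence mentioned above.

using LinearAlgebra

orth_prob(q, draw; ε=1e-2, trials=10_000) =
    count(_ -> abs(dot(q, normalize!(draw(length(q))))) ≤ ε, 1:trials) / trials

n = 500
q_axis  = normalize!([1.0; zeros(n - 1)])                   # a coordinate axis
q_mixed = normalize!([isodd(i) ? 1.0 : -1.0 for i in 1:n])  # a mixed-sign direction
for (name, q) in (("axis", q_axis), ("mixed-sign", q_mixed))
    println(name, ":  randn ", orth_prob(q, randn), "   rand ", orth_prob(q, rand))
end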

@antoine-levitt
Contributor

@stevengj yeah, I'd have guessed a sqrt(n) discrepancy seeing as the max bias for the rand distribution is of that order, but I have a very weak intuition for high dimensional probability...

So, I'm still on the fence about this PR. First, having the same function do two very different things (SVD or an iterative method) feels weird (do we already have an example of that? Usual polyalgorithms simply dispatch between different exact factorizations). Second, iterative methods are more suited to sparse matrices, so do we want this function to also work for sparse? But then do we want to fall back to SVD? Also, it would be good for this function to take a tolerance as a parameter, for early exit of the method. All in all this points to having a second method besides the current opnorm, like MATLAB's normest. And third, it does look a bit specialized for LinearAlgebra, and there are already packages doing Lanczos, so maybe it's better to just make sure there's a good function in one of the packages, and then point to it in the docs?

@stevengj
Member

Second, iterative methods are more suited to sparse matrices, so do we want this function to also work for sparse? But then do we want to fall back to SVD?

We could have an opnorm_estimate function that takes a tolerance and is suitable for sparse matrices, but have opnorm call opnorm_estimate with a machine-precision tolerance and a dense fallback if needed.
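
A rough sketch of that split (opnorm_estimate is not an existing LinearAlgebra function; the names, signatures, and convergence test below are all made up): the estimator takes a tolerance and reports whether it converged, and the exact-ish caller asks for machine precision and falls back to a dense SVD.

using LinearAlgebra

function opnorm_estimate(A::AbstractMatrix; tol::Real, maxiter::Integer=10 * size(A, 2))
    T = float(eltype(A))
    v = normalize!(randn(T, size(A, 2)))
    σ = zero(real(T))
    for _ in 1:maxiter
        w = A' * (A * v)                 # one step of power iteration on A'A
        nw = norm(w)
        iszero(nw) && return σ, false    # hit the null space; let the caller decide
        σnew = sqrt(nw)
        abs(σnew - σ) ≤ tol * σnew && return σnew, true
        σ, v = σnew, w ./ nw
    end
    return σ, false
end

function opnorm2_sketch(A::AbstractMatrix)   # hypothetical caller, not Base's opnorm
    σ, converged = opnorm_estimate(A; tol=eps(real(float(eltype(A)))))
    return converged ? σ : first(svdvals(A))
end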

@aravindh-krishnamoorthy
Contributor

aravindh-krishnamoorthy commented May 2, 2023

Glad that you found some optimization hotspots :) The tough part seems to be finding the right initial x and the number of iterations niter. It seems like for niter = n, with the current method, small to medium-sized matrices suffer in performance? Also:

julia> opnorm2(complex.(randn(2,2),randn(2,2)))
ERROR: InexactError: Float64(1.894212444018611 - 0.6331930071371694im)
[...]

I added a comment to this effect in the code. Have I got this wrong? Also, wasn't there some BLAS/LAPACK code for the power method and related iterative methods?

@aravindh-krishnamoorthy
Contributor

aravindh-krishnamoorthy commented May 2, 2023

Sorry, but unfortunately, I would also like to report that the benchmarks on my system are somehow different. The application is very interesting; I was attempting to figure out what the impact on small to medium-sized matrices might be, since (likely) convergence will fail and they'll fall back to svdvals. I use the following code in opnorm.jl:

# Uses power iteration to compute the maximal singular value.
# falls back to svdvals if it runs into numerics issues.
function opnorm2(A::AbstractMatrix{T}) where T
    LinearAlgebra.require_one_based_indexing(A)
    m,n = size(A)
    Tnorm = typeof(float(real(zero(T))))
    if m == 0 || n == 0 return zero(Tnorm) end
    if m == 1 || n == 1 return norm2(A) end
    # to minimize the chance of x being orthogonal to the largest eigenvector
    x = randn(Tnorm, n)
    tmp = zeros(Tnorm, m)
    At = A'
    v = one(Tnorm)
    # this will converge quickly as long as the top two eigenvalues are distinct
    for i in 1:n
        mul!(x, At, mul!(tmp, A, x))
        newv = norm(x)
        # the numerics got very wonky
        !isfinite(newv) && return first(svdvals(A))
        isapprox(v, newv; rtol=10*eps(Tnorm)) && return sqrt(newv)
        v = newv
        x .*= inv(newv)
    end
    # iteration failed to converge
    return first(svdvals(A))
end

Benchmarks on my system:

julia> using LinearAlgebra
julia> using BenchmarkTools
julia> include("opnorm.jl")
opnorm2 (generic function with 1 method)
julia> A = randn(1024,1024) ;
julia> methods(opnorm2)
# 1 method for generic function "opnorm2" from Main:
 [1] opnorm2(A::AbstractMatrix{T}) where T
     @ ~/opnorm.jl:3
julia> @benchmark opnorm2(A)
BenchmarkTools.Trial: 16 samples with 1 evaluation.
 Range (min … max):  145.561 ms … 395.863 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     338.414 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   330.935 ms ±  52.644 ms  ┊ GC (mean ± σ):  0.01% ± 0.03%
[...]
 Memory estimate: 16.27 KiB, allocs estimate: 3.

julia> methods(LinearAlgebra.opnorm2)
# 1 method for generic function "opnorm2" from LinearAlgebra:
 [1] opnorm2(A::AbstractMatrix{T}) where T
     @ ~/julia/julia/usr/share/julia/stdlib/v1.10/LinearAlgebra/src/generic.jl:673
julia> @benchmark LinearAlgebra.opnorm2(A)
BenchmarkTools.Trial: 34 samples with 1 evaluation.
 Range (min … max):  129.942 ms … 164.007 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     151.122 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   148.022 ms ±  10.197 ms  ┊ GC (mean ± σ):  0.05% ± 0.12%
[...]
Memory estimate: 8.59 MiB, allocs estimate: 10.

Also the numerical issues:

julia> LinearAlgebra.opnorm2(complex.(randn(2,2),randn(2,2)))
1.8431336899908002

julia> opnorm2(complex.(randn(2,2),randn(2,2)))
ERROR: InexactError: Float64(0.041218366090337966 + 0.0144244121071116im)
[...]
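
A hedged guess at the InexactError (not a confirmed diagnosis): in the code as posted, x and tmp are allocated with the real type Tnorm, so for a complex A the products A*x and A'*(A*x) are complex and cannot be stored back into real vectors. Allocating the work vectors with the matrix element type, and keeping Tnorm only for the returned norm, makes the same mul! chain go through:

using LinearAlgebra

A = complex.(randn(2, 2), randn(2, 2))
T = eltype(A)                           # ComplexF64
Tnorm = typeof(float(real(zero(T))))    # Float64: the type of the norm itself
x   = randn(float(T), size(A, 2))       # complex starting vector (was randn(Tnorm, n))
tmp = zeros(float(T), size(A, 1))       # complex scratch space (was zeros(Tnorm, m))
mul!(x, A', mul!(tmp, A, x))            # no InexactError with complex work vectors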

@oscardssmith
Member Author

I've now pushed some improvements that fix the squaring of the condition number (it turns out you don't need the sqrts, since this problem is self-dual: A' and A have the same norm, so this procedure finds the right and left singular vectors at the same time, both of which have the same singular value). The return condition is still pretty bad. For reference, here is the Lanczos version. It's a lot uglier, but it does work pretty well.

function irlb(A::Matrix{T}, nu; tol=eps(T), maxit=10000) where T
    m, n   = size(A)
    if min(m,n) < 2
        S = svd(A)
        return (S.U,S.S,S.Vt,0)
    end
    
    m_b  = min(nu+7, n)  # Working dimension size
    it   = 0
    j    = 1
    k    = nu
    smax = T(-Inf)
    local fn
    local S

    V = zeros(T, n, m_b)
    W = zeros(T, m, m_b)
    F = zeros(T, n)
    B = zeros(T, m_b, m_b)
    tmp = zeros(T, m_b)

    V[:,1] .= randn(T, n) # Initial vector
    V[:,1] .*= inv(norm(V[:,1]))

    @views while it < maxit
        it > 0 && (j = k)
        mul!(W[:,j], A, V[:,j])
        if it>0
            mul!(tmp[1:j-1], W[:,1:j-1]', W[:,j])
            mul!(W[:,j], W[:,1:j-1], tmp[1:j-1], -1, 1) # W[:,j]-=W[:,1:j-1]*tmp[1:j-1]
        end
        s = norm(W[:,j])
        W[:,j] .*= inv(s)
        # Lanczos process
        while j<=m_b
            mul!(F, A', W[:,j])
            mul!(F, s, V[:,j], -1, 1) # F -= s*V[:,j]
            mul!(tmp[1:j], V[:,1:j]', F)
            mul!(F, V[:,1:j], tmp[1:j], -1, 1)
            fn = norm(F)
            F  .*= inv(fn)
            B[j,j] = s
            if j<m_b
                V[:,j+1] .= F
                B[j,j+1] = fn 
                mul!(W[:,j+1], A, V[:,j+1])
                # One step of classical Gram-Schmidt...
                mul!(W[:,j+1], fn, W[:,j], -1, 1)
                # ...with full reorthogonalization
                mul!(tmp[1:j], W[:,1:j]', W[:,j+1])
                mul!(W[:,j+1], W[:,1:j], tmp[1:j], -1, 1) # W[:,j+1] -= W[:,1:j]*tmp[1:j]
                s = norm(W[:,j+1])
                W[:,j+1] .*= inv(s)
            end
            j+=1
        end
        # End of Lanczos process
        S    = svd!(B)
        R    = fn .* S.U[end,:] # Residuals
        smax = max(S.S[1], smax)
        conv = sum(abs.(R[1:nu+1]) .< tol*smax)
        if(conv < nu)  # Not converged yet
            k = max(conv+nu+1, k)
            k = min(k, m_b-1)
        else
            break
        end
        # Update the Ritz vectors
        V[:,1:k] .= V[:,1:m_b]*S.Vt[:,1:k]
        V[:,k] .= F 
        B .= 0
        B[diagind(B)[1:k]] .= S.S[1:k]
        B[1:k,k] .= R[1:k]
        # Update the left approximate singular vectors
        mul!(W[:,1:k], W, S.U[:,1:k])
        it += 1
    end

    U = W[:,1:m_b]*S.U[:,1:nu]
    V = V[:,1:m_b]*S.Vt[:,1:nu]
    return (U,S.S[1:nu],V,it)
end

@aravindh-krishnamoorthy
Contributor

Hello, I'm a new guy here and I don't mean to be disrespectful to anyone.

The power algorithm has its place, but it's more of a learning tool, a first step into numerical algorithms. There's a range of inputs it works well on, but I feel that it has no place in the stdlib, and definitely not for ::AbstractMatrix{T}. Furthermore, falling back to svdvals when convergence fails does not seem like a good idea, and it suggests that we are probably using the wrong approach. Although several numerically stable variants are available, each of them addresses a specific data type and is not generic enough.

Since a lot of good work has been done in this PR, I'd suggest the following:

  1. Use opnorm2 based on svdvals or LAPACK/BLAS as done today.
  2. In the documentation, write a note that opnorm2 may be slow and can be sped up for specific instances using power iteration. Then provide a curated form of this code, removing the fallbacks, etc.

This is analogous to the approach taken by BLAS/LAPACK with respect to power iteration. Furthermore, the method is simple and straightforward enough for a user to implement and experiment with for their specific needs.

@DilumAluthge
Member

We have moved the LinearAlgebra stdlib to an external repo: https://github.com/JuliaLang/LinearAlgebra.jl

@oscardssmith If you think that this PR is still relevant, please open a new PR on the LinearAlgebra.jl repo.
